Interface apparatus

ABSTRACT

The invention relates to an interface apparatus for input and output of appliances having a display, such as a computer, word processor, information appliance or television, comprising recognizing means for recognizing the shape or move of the hand of an operator, display means for displaying the features of the shape or move of the hand recognized by the recognizing means as a special shape in the screen, and control means for controlling the information displayed in the screen by the special shape displayed in the screen by the display means, wherein the two-dimensional or three-dimensional information displayed in the screen can be selected, indicated or moved only by changing the shape of the hand or moving the hand, so that an interface apparatus of excellent controllability and high diversity may be presented.

This Application is a U.S. National Phase Application of PCT International Application PCT/JP96/01124.

BACKGROUND OF THE INVENTION

The present invention relates to an interface apparatus for input and output of an information apparatus such as a computer or word processor, or of an appliance having a display such as a television.

In one kind of conventional interface apparatus, a cursor is displayed at the coordinate position detected by a mouse on a display screen, in order to add other information to the information shown on the display device, or to change or select the displayed information.

FIG. 30 shows an outline of this conventional interface apparatus. In FIG. 30, reference numeral 501 denotes a host computer, and 502 is a display, and virtual operation buttons 503, 504, 505 are displayed in the display 502 by the host computer 501. Reference numeral 506 represents a mouse cursor, and the host computer 501 controls the display so that the mouse cursor 506 moves in the screen in synchronism with the move of the mouse 507, on the basis of the moving distance detected by the mouse 507. As the user moves the mouse 507, the mouse cursor 506 is moved to the position of a desired virtual operation button in the display screen, and by pressing a switch 508 on the mouse 507, an operation button is selected so as to instruct action to the host computer 501.

In this conventional construction, however, a mouse or other input device is necessary in addition to the main body of the appliance, and a table or area for manipulating the mouse is also needed, which is not suited to a portable information appliance or the like. Besides, manipulation through the mouse is not a direct and intuitive interface.

SUMMARY OF THE INVENTION

It is an object of the invention to present an interface apparatus capable of manipulating an appliance easily without requiring an input device such as a keyboard or mouse. It is another object thereof to present an interface apparatus further advanced in the ease of manipulation of indicating or catching the display object by judging interactions along the intent of the operator sequentially and automatically.

In structure, the invention provides an interface apparatus comprising recognizing means for recognizing the shape of a hand of an operator, display means for displaying the features of the shape of the hand recognized by the recognizing means on the screen as a special shape, and control means for controlling the information displayed in the screen by the special shape displayed in the screen by the display means, whereby the information displayed in the screen can be controlled only by varying the shape of the hand.

It is a further object to present an interface apparatus much superior in ease of manipulation by recognizing also the move of the hand. To recognize the move, there are provided a frame memory for saving the picked-up image of the shape or move of the hand, and a reference image memory for storing, as a reference image, an image taken before the image saved in the frame memory, and recognition is achieved by depicting the difference between the image in the frame memory and the reference image stored in the reference image memory. In another method of recognition, the shape or move of the hand of the user in the taken image is depicted as the contour of the user, the contour is traced, the relation between the angle of the contour line and the length of the contour line, that is, the contour waveform, is calculated and filtered, and a shape waveform expressing the specified shape is generated.

Moreover, the apparatus comprises cursor display means for displaying a feature of the shape of a hand on the screen as a special shape and manipulating it as a cursor, means for storing, as the relation with a display object other than the cursor display, the coordinates and shape of a representative point representing the position of that display object, and means for calculating and judging the interaction between the cursor display and the display object. Thus, when the cursor display is shown as a virtual manipulator and a displayed virtual object is gripped, manipulation is realized smoothly by interactions that follow the intent of the operator.

In the interface apparatus thus constructed, as the user faces the recognizing means and shows, for example, a hand, the special shape corresponding to the shape of the hand is displayed as an icon in the screen for screen manipulation, so that control according to the icon display is enabled.

Or when instructed by hand gesture, the given hand gesture is displayed as a special shape set corresponding to the shape of the hand on the display screen, and its move is also displayed, and, for example, a virtual switch or the like displayed on the display screen can be selected by the hand gesture, or the display object displayed on the screen can be grabbed or carried depending on the purpose, and therefore, without requiring a mouse or other input device, a very simple manipulation of the appliance is realized.

It is further possible to realize an interface much enhanced in the ease of manipulation by sequentially and automatically judging the interaction with the display object desired to be operated by the virtual manipulator according to the intent of operation of the operator, as the special shape set corresponding to the shape of the hand works as a virtual manipulator and not as a mere cursor.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an appearance drawing of an interface apparatus in a first embodiment of the invention;

FIG. 2 is a detailed block diagram of the interface apparatus in the same embodiment of the invention;

FIG. 3 is a diagram showing an example of shape of hand judged by the interface apparatus in the same embodiment of the invention;

FIG. 4 is a diagram showing an example of shape identifying means of the interface apparatus in the same embodiment of the invention;

FIG. 5 is a diagram showing an example of operation by an image difference operation unit in the same embodiment;

FIG. 6 is a diagram showing an example of icon generated by an icon generating unit in the same embodiment;

FIG. 7 is an appearance drawing showing an operation example of the interface apparatus of the same embodiment;

FIG. 8 is an appearance drawing of an interface apparatus in a second embodiment of the invention;

FIG. 9 is a detailed block diagram of the interface apparatus in the second embodiment of the invention;

FIG. 10 is a diagram showing an example of shape of hand judged by the interface apparatus of the same embodiment;

FIG. 11 is a diagram showing an example of motion recognizing unit of the interface apparatus of the same embodiment;

FIG. 12 is a diagram showing an example of operation by an image difference operation unit in the same embodiment;

FIG. 13 is a diagram showing an operation example of the same embodiment;

FIG. 14 is a detailed block diagram of an interface apparatus in a third embodiment of the invention;

FIG. 15 is a diagram showing an example of motion recognizing unit of the interface apparatus in the third embodiment of the invention;

FIG. 16(A) to (D) are diagrams showing examples of icons displayed on a display screen by the interface apparatus of the same embodiment;

FIG. 17 is a diagram showing operation of motion recognizing unit of the interface apparatus in the same embodiment of the invention;

FIG. 18 is a diagram showing operation of motion recognizing unit of the interface apparatus in the same embodiment of the invention;

FIG. 19 is a diagram showing operation of motion recognizing unit of the interface apparatus in the same embodiment of the invention;

FIG. 20 is a diagram showing operation of motion recognizing unit of the interface apparatus in the same embodiment of the invention;

FIG. 21 is a diagram showing an interface apparatus explaining a fourth embodiment;

FIG. 22(a) is a diagram showing an open state of cursor in an example of a cursor used in the interface apparatus of the same embodiment;

(b) is a diagram showing a closed state of the same embodiment;

(c) is a diagram showing an open state of cursor in an example of a cursor used in the interface apparatus of the same embodiment;

(d) is a diagram showing a closed state of the same embodiment;

(e) is a diagram showing an open state of cursor in an example of a cursor used in the interface apparatus of the same embodiment;

(f) is a diagram showing a closed state of the same embodiment;

FIG. 23(a) is a diagram showing the shape of an example of a virtual object used in the interface apparatus of the same embodiment;

(b) is a diagram showing the shape of other example of a virtual object used in the interface apparatus of the same embodiment;

FIG. 24(a) is a front view showing configuration of cursor and virtual object in a virtual space;

(b) is a side view showing configuration of cursor and virtual object in a virtual space;

FIG. 25 is a diagram showing a display example of virtual space for explaining the embodiment;

FIG. 26 is a block diagram showing an example of the interface apparatus of the same embodiment;

FIG. 27(a) is a diagram showing an example of input device in input means used in the interface apparatus of the same embodiment;

(b) is a diagram showing an example of input device in input means used in the interface apparatus of the same embodiment;

(c) is a diagram showing an example of input device in input means used in the interface apparatus of the same embodiment;

FIG. 28(a) is a diagram showing an example of image of a hand taken by a camera in the same embodiment;

(b) is a diagram showing a binary example of image of a hand taken by a camera in the same embodiment;

FIG. 29(a) is a diagram showing an example of image displayed by display means used in the interface apparatus in the same embodiment of the invention;

(b) is a diagram showing a second example of the display screen;

(c) is a diagram showing a third example of the display screen;

(d) is a diagram showing a fourth example of the display screen; and

FIG. 30 is an explanatory diagram for explaining a conventional interface apparatus.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

(First embodiment)

A first embodiment of the invention relates to an interface comprising recognizing means such as an image pickup device for recognizing the shape of a hand of the operator, display means for displaying the feature of the shape of the hand recognized by the recognizing means on a screen as a special shape by an icon or the like, and control means for controlling the information displayed on the screen by varying the shape of the hand, the special shape such as an icon displayed on the screen by the display means being operated as a so-called cursor.

FIG. 1 shows the appearance of the first embodiment of the interface apparatus of the invention. Reference numeral 1 denotes a host computer, 2 is a display unit, and 3 is a CCD camera for picking up an image. The CCD camera 3 has the pickup surface located in the same direction as the display direction, so that the shape of the hand of the user can be picked up when the user confronts the display screen. On the display, menus 201, 202, and an icon 200 reflecting the shape of the hand are displayed.

FIG. 2 is a detailed block diagram of the invention. The image fed in from the CCD camera is stored in a frame memory 21. In a reference image memory 25, a previously taken background image not including a person is stored as a reference image. The reference image may be updated whenever required.

Shape identifying means 22 depicts the difference of the image saved in the frame memory and the image stored in the reference image memory, and removes the background image from the image, depicts, for example, the portion corresponding to the hand of the user, and judges if the shape is, for example, one finger as shown in FIG. 3(A), two fingers as shown in FIG. 3(B), or three fingers as shown in FIG. 3(C).

FIG. 4 shows a detailed example of the shape identifying means 22, which comprises an image difference operation unit 221, a contour depicting unit 222, and a shape identifying unit 223.

The image difference operation unit 221 calculates the difference of the image saved in the frame memory and the image stored in the reference image memory as mentioned above. As a result, the object to be detected, for example, the user, can be separated from the background portion. For example, when the image difference operation unit 221 is composed of a simple subtraction circuit, as shown in FIG. 5, only the portion of the hand of the user in the image in the frame memory can be depicted. The contour depicting unit 222 depicts the contour shape of the object existing in the image as a result of operation by the image difference operation unit 221. As a practical method, for example, by depicting the edge of the image, the contour shape may be easily depicted.
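The subtraction and edge steps described above can be illustrated with a minimal sketch. It assumes grayscale images held as lists of rows; the function names `image_difference` and `contour_pixels`, and the `threshold` parameter, are hypothetical and are not taken from the embodiment, which only specifies a simple subtraction circuit followed by edge depiction.

```python
def image_difference(frame, reference, threshold=30):
    """Binary mask: 1 where the frame differs from the reference image.

    The threshold is an illustrative assumption; the text only names
    a simple subtraction.
    """
    return [
        [1 if abs(f - r) > threshold else 0 for f, r in zip(frow, rrow)]
        for frow, rrow in zip(frame, reference)
    ]


def contour_pixels(mask):
    """Foreground pixels having a 4-neighbour that is background or off-image."""
    h, w = len(mask), len(mask[0])
    edge = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    edge.append((x, y))
                    break
    return edge


# Usage: a bright 3x3 "hand" on a uniform 5x5 background.
reference = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in reference]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 200
mask = image_difference(frame, reference)
edge = contour_pixels(mask)
```

In this toy case the mask marks the 3x3 block, and the contour consists of every block pixel except the interior one.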

The shape identifying unit 223 identifies specifically the contour shape of the hand depicted by the contour depicting unit 222, and judges if the shape is, for example, one finger as shown in FIG. 3(A) or two fingers as shown in FIG. 3(B). As the shape identifying method, for example, template matching, matching technique with shape model, and neural network may be employed, among others.

An icon generating unit 24 generates an icon image as a special shape to be shown in the display, on the basis of the result of identifying the hand shape by the shape identifying unit 223. For example, when the result of identifying the shape of the hand was one finger, an icon of numeral “1” is generated as shown in FIG. 6(A), or in the case of two fingers, an icon of numeral “2” is created as in FIG. 6(B). As the shape of the icon, alternatively, when the result of identifying the shape of the hand was one finger, an icon of one finger may be shown as shown in FIG. 6(C), or in the case of two fingers, an icon of two fingers may be created as shown in FIG. 6(D). A display controller 23 controls the display on the basis of the result of identifying the shape of the hand by the shape identifying unit 223. For example, while displaying the icon according to the result of identifying, the menu associated in advance with the result of identifying is displayed with emphasis on the basis of the hand shape identifying result.

In this embodiment of the invention, an example of operation is described below. As shown in FIG. 7(A), when the user confronts the appliance having the interface apparatus of the invention and points out one finger, an icon of numeral “1” is shown on the display, and the display of television on the first menu is shown by emphasis. At this time, by using sound or voice from the display device in tune with the emphasis display, the attention of the operator may be attracted. Herein, by pointing out two fingers as in FIG. 7(B), an icon of numeral “2” is shown on the display and the display of network on the second menu is shown by emphasis. In this state, by maintaining the same hand shape for a specific time, the second menu is selected, and an instruction is given to the host computer so as to display the network terminal. For selection of menu, sound or the like may be used at the same time. In the case of a hand shape different from those determined preliminarily, as in FIG. 7(C), icon and menu are not shown on the display, and no instruction is given to the host computer.
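The emphasis-then-select behaviour described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the "specific time" is modelled as a count of consecutive frames, and the class name `DwellSelector` and its parameters `hold_frames` and `known_shapes` are hypothetical.

```python
class DwellSelector:
    """Highlights the menu for a recognized shape and selects it after a hold."""

    def __init__(self, hold_frames=3, known_shapes=(1, 2, 3)):
        self.hold_frames = hold_frames        # assumed stand-in for "specific time"
        self.known_shapes = set(known_shapes)  # finger counts with a menu assigned
        self.current = None
        self.count = 0

    def update(self, shape):
        """Feed one recognized shape per frame; returns (highlighted, selected)."""
        if shape not in self.known_shapes:
            # Unregistered shape, as in FIG. 7(C): no icon, no menu, no instruction.
            self.current, self.count = None, 0
            return None, None
        if shape == self.current:
            self.count += 1
        else:
            self.current, self.count = shape, 1
        # Fire the selection exactly once, when the hold duration is reached.
        selected = shape if self.count == self.hold_frames else None
        return shape, selected


# Usage: one finger, then two fingers held, then an unregistered shape.
selector = DwellSelector(hold_frames=3)
results = [selector.update(s) for s in (1, 2, 2, 2, 2, 0)]
```

Firing the selection only on the frame where the count reaches the hold length avoids issuing the same instruction to the host computer repeatedly while the hand stays still.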

Thus, according to the invention, by identifying the shape of the hand in the taken image, it is possible to control the computer or appliance on the basis of the result of identifying, and it is possible to manipulate from a remote distance without making contact and without using a keyboard, mouse or other device. Besides, as the result of identifying the shape of the hand is reflected in the screen, the user can manipulate while confirming the result of identifying, and easy and secure manipulation is possible.

This embodiment is an example of application to selection of menu, but by presetting so that the icon display according to a specific shape of hand may be replaced by a picture or message, it is also possible to control display and writing of a picture or message.

(Second embodiment)

A second embodiment of the invention relates to an interface apparatus comprising at least an image pickup unit, a motion recognizing unit for recognizing the shape or move of an object in a taken picture, and a display unit for displaying the shape or move of the object recognized by the motion recognizing unit, and further comprising a frame memory for storing the image taken by the image pickup unit, and a reference image memory for storing, as a reference image, an image taken before the image saved in the frame memory, wherein the motion recognizing unit comprises an image change depicting unit for depicting the difference between the image in the frame memory and the reference image stored in the reference image memory.

FIG. 8 shows the appearance of the second embodiment of the interface apparatus of the invention. In FIG. 8, same constituent elements as in the first embodiment are identified with same reference numerals. That is, reference numeral 1 is a host computer, 2 is a display unit, and 3 is a CCD camera for picking up an image. The CCD camera 3 has an image pickup surface located in the same direction as the display direction, so that the hand gesture of the user can be picked up as the user confronts the display surface. On the display surface of the display unit 2, virtual switches 204, 205, 206, and an icon of an arrow cursor 203 for selecting the virtual switches are displayed.

FIG. 9 is a block diagram showing a specific constitution of the embodiment. The image fed through the CCD camera 3 is saved in a frame memory 21. A preliminarily taken image is stored in a reference image memory 25 as a reference image. A reference image updating unit 26 is composed of a timer 261 and an image updating unit 262, and is designed to update the reference image by transferring the latest image stored in the frame memory 21 to the reference image memory 25 at a specific time interval indicated by the timer 261.
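The timer-driven update of the reference image can be sketched as follows. The class name `ReferenceImageUpdater`, the `interval` parameter and the explicit `now` timestamp are assumptions made for illustration; the embodiment states only that the latest frame is transferred to the reference image memory at a specific time interval.

```python
class ReferenceImageUpdater:
    """Keeps a reference image and refreshes it from the frame at a fixed interval."""

    def __init__(self, interval):
        self.interval = interval   # assumed units: seconds
        self.reference = None
        self.last_update = None

    def _store(self, frame, now):
        self.reference = [row[:] for row in frame]  # copy, as a memory transfer would
        self.last_update = now

    def step(self, frame, now):
        """Return the reference to compare this frame against, then update if due."""
        if self.reference is None:
            self._store(frame, now)
            return self.reference
        previous = self.reference
        if now - self.last_update >= self.interval:
            self._store(frame, now)
        return previous


# Usage with one-pixel "images" and explicit timestamps.
updater = ReferenceImageUpdater(interval=1.0)
r0 = updater.step([[0]], 0.0)   # first frame becomes the reference
r1 = updater.step([[1]], 0.5)   # interval not yet elapsed
r2 = updater.step([[2]], 1.0)   # compared against the old reference, then refreshed
r3 = updater.step([[1]], 1.2)   # compared against the refreshed reference
```

Returning the previous reference before refreshing is a design choice of this sketch: it keeps the difference image non-empty on the frame where the update happens.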

A motion recognizing unit 22 depicts the difference between the image saved in the frame memory and the image stored in the reference image memory, and eliminates the background image from the image, and depicts the portion corresponding, for example, to the hand of the user, and also judges if the shape is one finger as shown in FIG. 10(A) or a fist as shown in FIG. 10(B).

FIG. 11 shows a detailed example of the motion recognizing unit 22, being composed of an image difference operation unit 221, a contour depicting unit 222, a shape change identifying unit 225, and a position detector 224.

The image difference operation unit 221 calculates the difference between the image saved in the frame memory 21 and the image stored in the reference image memory 25 as mentioned above. Consequently, the object desired to be depicted as motion, for example, the hand portion of the user, can be separated from the background portion, and only the moving object image can be depicted at the same time. For example, when the image difference operation unit 221 is composed of a mere subtraction circuit, as shown in FIG. 12, the hand portion in the reference image and only the hand portion of the latest image in the frame memory can be depicted, so that only the moving hand portion can be easily identified. The contour depicting unit 222 depicts the object existing in the image as the result of operation by the image difference operation unit 221, that is, the contour shape of the hand portion before moving and after moving. As an example of practical method, by depicting the edge of the image, the contour shape can be easily depicted.

The shape change identifying unit 225 identifies the detail of the contour shape of the hand portion after moving being depicted by the contour depicting unit 222, and judges if the shape is, for example, a finger as shown in FIG. 10(A), or a fist as shown in FIG. 10(B). At the same time, the position detector 224 calculates the coordinates of the center of gravity of the contour shape of the hand portion of the user after moving.

An icon generating unit 24 generates an icon image to be shown on the display on the basis of the result of identifying the hand shape by the shape change identifying unit 225. As examples of icon image, when the result of identifying the hand shape is one finger, for example, the arrow marked icon as shown in FIG. 13(A) may be generated, or in the case of a fist shape, the x-marked icon as shown in FIG. 13(B) may be generated. Or, if the identifying result of hand shape is two fingers, an icon mimicking two fingers as shown, for example, in FIG. 13(C) may be generated, or in the case of a fist, an icon mimicking the fist may be generated as shown in FIG. 13(D).

A display controller 23 controls the display position of the icon generated by the icon generating unit 24 on the display 2, and is composed of a coordinate transforming unit 231 and a coordinate inverting unit 232. The coordinate transforming unit 231 transforms from the coordinates of the taken image into display coordinates on the display 2, and the coordinate inverting unit 232 inverts the lateral positions of the transformed display coordinates. That is, the coordinates of the center of gravity in the image of the portion corresponding to the user's hand detected by the position detector 224 are transformed into display coordinates on the display 2, and the lateral coordinates are inverted to display an icon in the display 2. By this manipulation, when the user moves the hand to the right, the icon moves to the right on the display screen, like a mirror action.
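The chain from center of gravity to mirrored display position can be written out as a short sketch. The function names, the linear scaling and the image and display sizes are assumptions for illustration; the embodiment does not specify the form of the coordinate transformation.

```python
def centroid(points):
    """Center of gravity of a list of (x, y) contour points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def to_display(point, image_size, display_size):
    """Scale image coordinates to display coordinates (assumed linear scaling)."""
    (x, y), (iw, ih), (dw, dh) = point, image_size, display_size
    return (x * dw / iw, y * dh / ih)


def invert_lateral(point, display_size):
    """Mirror the horizontal coordinate so the icon follows the hand like a mirror."""
    x, y = point
    return (display_size[0] - x, y)


# Usage: a hand centered at (80, 60) in a 320x240 camera image, 640x480 display.
hand = centroid([(70, 50), (90, 50), (90, 70), (70, 70)])
on_display = to_display(hand, (320, 240), (640, 480))
icon_position = invert_lateral(on_display, (640, 480))
```

Because the camera faces the user, a hand on the user's right appears on the left of the taken image; the lateral inversion places the icon on the right of the screen, which is the mirror behaviour the text describes.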

In the embodiment thus constituted, an example of operation is described below. As shown in FIG. 8, when the user confronts the appliance incorporating the interface apparatus of the embodiment and points out one finger of the hand, the arrow cursor appearing on the display moves to an arbitrary position corresponding to the move of the hand. Then, by moving the hand to an arbitrary one of the virtual switches 204, 205, 206 shown on the display 2, the arrow cursor is moved, and when the hand is gripped to form a fist, the one of the virtual switches 204, 205, 206 is selected, and an instruction is given to the host computer 1.

In this embodiment, it is designed to recognize the shape and move of the object in the taken image, but it may be also designed to recognize either the shape or the move of the object in the taken image.

Thus, the invention comprises the motion recognizing unit for recognizing the shape and/or move of the object in the taken image, the display unit for displaying the shape and/or move of the object recognized by the motion recognizing unit, the frame memory for saving the image taken by the image pickup means, and the reference image memory for storing, as a reference image, the image taken before the image saved in the frame memory, and the motion recognizing unit depicts the difference between the image in the frame memory and the reference image stored in the reference image memory. Accordingly, when the user confronts the image pickup unit and gives instruction by, for example, a hand gesture, the given hand gesture is shown on the display screen, and a virtual switch or the like shown on the display screen can be selected, for example, by the hand gesture, and a very simple manipulation of the appliance without requiring an input device such as a mouse is realized.

(Third embodiment)

A third embodiment of the invention relates to an interface apparatus comprising at least an image pickup unit, a motion recognizing unit for recognizing the shape and/or move of the hand of the user in the taken image, and a display unit for displaying the shape and/or move of the hand of the user recognized by the motion recognizing unit, wherein the motion recognizing unit is composed of contour depicting means for depicting the contour of the taken user image, a contour waveform operation unit for tracing the depicted contour and calculating the relation between the angle of the contour line and the length of the contour line, that is, the contour waveform, and a shape filter for filtering the contour waveform calculated by the contour waveform operation unit to generate a shape waveform expressing a specific shape.

When the user confronts the image pickup unit of the interface apparatus thus constituted and instructs by hand gesture, the image pickup unit picks up the user's image. The contour depicting means depicts the contour of the user's image, and the contour waveform operation unit transforms the contour into a contour waveform, that is, the angle of the contour line relative to the horizontal line, taking as the horizontal axis the length of the contour line from a reference point on the contour. This contour waveform is transformed into a shape waveform expressing the uneven shape of the finger by a shape filter composed of a band pass filter in a specified band, for example, a band pass filter corresponding to the uneven surface of the finger, and the location of the finger is calculated, and only by counting the number of pulses existing in this shape waveform, the number of projected fingers, that is, the shape of the hand can be accurately judged. On the basis of the position or shape of the hand, the given hand gesture is shown on the display screen, and, for example, a virtual switch shown on the display screen can be selected by the hand gesture, and therefore very simple manipulation of the appliance is realized without requiring a mouse or other input device.

Moreover, plural shape filters may be composed of plural band pass filters differing in band, and the motion of the user may be judged on the basis of the shape waveforms generated by the plural shape filters. As a result, plural shapes can be recognized.

Alternatively, plural shape filters may be composed at least of a band pass filter passing the band of the contour waveform corresponding to undulations of the hand, and a band pass filter passing the band of the contour waveform corresponding to undulations of the fingers. As a result, the image is transformed into a smooth shape waveform reflecting only undulations of the hand portion or into the shape waveform reflecting only undulations of the finger.
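As a rough illustration of such a shape filter, a band pass filter can be approximated by subtracting a heavily smoothed copy of the waveform from a lightly smoothed one. This is a swapped-in, simplified technique chosen for brevity, not the filter of the embodiment; the function names `moving_average` and `band_pass` and the window lengths are assumptions.

```python
def moving_average(wave, k):
    """Centered moving average of window k, shrinking the window at the ends."""
    half = k // 2
    out = []
    for i in range(len(wave)):
        window = wave[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def band_pass(wave, short, long):
    """Crude band pass: light smoothing minus heavy smoothing.

    Detail finer than `short` samples and trends slower than `long`
    samples are both attenuated.
    """
    fine = moving_average(wave, short)
    coarse = moving_average(wave, long)
    return [a - b for a, b in zip(fine, coarse)]


# Usage: a constant waveform is rejected, a narrow bump passes.
flat = band_pass([5.0] * 10, 3, 7)
bump = band_pass([0.0] * 5 + [1.0] + [0.0] * 5, 1, 5)
```

Choosing a long window for the hand-scale filter and a short one for the finger-scale filter would give two filters of differing band, in the spirit of the plural shape filters described above.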

The motion recognizing unit may be constituted by comprising a coordinate table for storing the correspondence between the coordinates of the contour shape of the taken image of the user and the contour shape calculated by the contour shape operation unit, and a coordinate operation unit for calculating the coordinates of the location of the specified shape in the taken image, by using the wave crest position of the shape waveform and the coordinate table. Hence, the coordinates of the contour shape are calculated, and the coordinates are issued.

The motion recognizing unit may be also composed by comprising a shape judging unit for counting the number of pulses in the shape waveform generated by the shape filter, and the shape of the object may be judged by the output value of the shape judging unit. It is hence easy to judge whether the hand is projecting two fingers or gripped to form a fist, by the number of pulses.

Also the motion recognizing unit may be composed by comprising a differentiating device for differentiating the shape waveform generated by the shape filter. By differentiation, the waveform is more pulse-like, and it is easier to count the number of pulses.

The appearance of the embodiment of the interface apparatus of the invention is similar to the one shown in FIG. 8 relating to the second embodiment, and same parts as in the second embodiment are explained by referring to FIG. 8 and FIG. 10, and only other different parts are explained in FIG. 14 and after.

FIG. 14 is a detailed block diagram of the third embodiment of the invention. An image fed from the CCD camera 3 is stored in a frame memory 31. The motion recognizing unit 32 depicts the portion corresponding, for example, to the hand of the user from the image stored in the frame memory 31, and judges if the shape is, for example, one finger as shown in FIG. 10(A), or a fist as shown in FIG. 10(B).

FIG. 15 shows a detail of execution of the motion recognizing unit 32, and its detailed operation is described while referring also to FIG. 16 to FIG. 20.

A contour depicting unit 321 depicts the contour shape of the object existing in the image. As an example of specific method, the image is transformed into binary data, and by depicting the edge, the contour shape can be easily depicted. FIG. 17(A1) is an example of depicted contour line, showing the hand projecting one finger.

A contour shape operation unit 322, starting from start point s in the diagram of the contour shape of the object depicted by the contour depicting unit 321 as shown in FIG. 17(A1), traces the contour line in the direction of arrow in the drawing (counterclockwise), depicts the angle θ from the horizontal line of the contour line at each point x on the contour line as shown in FIG. 19 as the function in terms of the distance l from the start point s, and transforms into the waveform shape regarding the distance l as the time axis as shown in FIG. 17(B1), and simultaneously stores the coordinates of each point on the contour line corresponding to the distance l in a transformation table 324 in a table form. A shape filter 1 and a shape filter 2, corresponding to reference numerals 3231 and 3232 respectively, are filters for passing the bands corresponding to the undulations of the hand portion and the undulations of the finger portion, respectively, in the contour waveform shown in FIG. 17(B1).
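The tracing step above can be sketched for a contour already given as an ordered list of points. The function name `contour_waveform` and the tuple layout are assumptions for illustration; the sketch returns, for each point, the distance l from the start point, the angle θ of the contour line at that point, and the point coordinates, the last of which plays the role of the transformation table.

```python
import math


def contour_waveform(contour):
    """Return (l, theta, (x, y)) samples for an ordered contour starting at s.

    theta is the direction of the segment leaving each point, measured from
    the horizontal in radians; l is the accumulated contour length.
    Angle unwrapping across +/-pi is not handled in this sketch.
    """
    samples = []
    length = 0.0
    for (x0, y0), (x1, y1) in zip(contour, contour[1:]):
        theta = math.atan2(y1 - y0, x1 - x0)
        samples.append((length, theta, (x0, y0)))
        length += math.hypot(x1 - x0, y1 - y0)
    return samples


# Usage: three sides of a unit square, traced from s = (0, 0).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
waveform = contour_waveform(square)
table = {l: xy for l, _, xy in waveform}   # distance l -> image coordinates
```

Keeping the coordinates alongside each sample is what later allows a position on the l axis, such as the location of a pulse, to be converted back into a point in the image.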

By the shape filter 1, FIG. 17(B1) is transformed into a smooth shape waveform reflecting only the undulations of the hand portion as shown in FIG. 17(B11), and by the shape filter 2, it is transformed into the shape waveform reflecting only the undulations of the finger as shown in FIG. 17(B12), and both are differentiated by differentiating devices 3251 and 3252, and finally differential waveforms as shown in FIG. 17(B112) and FIG. 17(B122) are obtained. The shape judging unit 3262 judges whether the contour shape of the hand portion is two fingers as shown in FIG. 10(A) or a fist as shown in FIG. 10(B), and at the same time the coordinate operation unit 3261 calculates the coordinates of the center of gravity of the contour shape of the portion of the hand of the user. The coordinate operation unit 3261 determines positions lc1, lc2 of location of large pulse waveforms in the shape differential waveform shown in FIG. 17(B112), and transforms them into point c1 and point c2 shown in FIG. 20 by the coordinate transformation table 324, and the center of gravity of the hand portion is calculated from the contour line of the hand portion from point c1 to point c2, and issued as hand coordinates.

The shape judging unit 3262 counts and issues the number of pulse waveforms corresponding to the finger portion in the shape differential waveform in FIG. 17(B122). That is, in the case of FIG. 17(B122), since there are two large pulse waveforms corresponding to the portion of the finger, it is judged and issued as the shape of two fingers as shown in FIG. 10(A). Or when the hand is gripped as shown in FIG. 18(A2), there is almost no undulation of finger portion, and the output of the shape filter 2 is a shape waveform without undulations as shown in FIG. 18(B22), and hence the output of the differentiating device 3252 is also a shape differential waveform without pulse waveform as shown in FIG. 18(B222), and the count of pulse waveforms is 0, and it is hence judged and issued as the fist shape as shown in FIG. 10(B).
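Differentiation followed by pulse counting can be sketched with simple threshold processing. The function names and the threshold value are assumptions; how a pulse count maps to a number of fingers depends on the filter and on the judging rule, which this sketch does not attempt to reproduce.

```python
def differentiate(wave):
    """First difference of a sampled shape waveform."""
    return [b - a for a, b in zip(wave, wave[1:])]


def count_pulses(wave, threshold):
    """Count runs of consecutive samples whose magnitude exceeds the threshold."""
    pulses, inside = 0, False
    for v in wave:
        above = abs(v) > threshold
        if above and not inside:
            pulses += 1
        inside = above
    return pulses


# Usage: a waveform with one undulation versus a flat one (gripped hand).
undulating = [0, 0, 0, 1, 1, 1, 0, 0, 0]
flat = [0] * 9
pulses_undulating = count_pulses(differentiate(undulating), 0.5)
pulses_flat = count_pulses(differentiate(flat), 0.5)
```

In this toy input the single undulation yields two pulses after differentiation, one at each flank, and the flat waveform yields none.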

As a practical example of composition of the shape judging unit 3262, a simple threshold processing method or a neural network may be employed.
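The threshold processing mentioned above can be sketched as follows. This is only an illustrative sketch, not the circuit of the shape judging unit 3262: it assumes the shape differential waveform is available as a list of samples, that each projected finger contributes one run of samples above a threshold, and the names `count_pulses`, `judge_shape` and the threshold value are hypothetical.

```python
def count_pulses(diff_waveform, threshold):
    """Count runs of consecutive samples that exceed the threshold.

    Assumption: one such run corresponds to one large pulse waveform,
    i.e. to one projected finger.
    """
    count = 0
    inside_pulse = False
    for value in diff_waveform:
        above = value > threshold
        if above and not inside_pulse:
            count += 1  # a new pulse starts here
        inside_pulse = above
    return count


def judge_shape(diff_waveform, threshold=0.5):
    """Return 'fist' when no pulse is found, otherwise 'fingers'."""
    return "fist" if count_pulses(diff_waveform, threshold) == 0 else "fingers"
```

With a waveform such as `[0, 0.1, 0.9, 1.0, 0.2, 0, 0.8, 0.1]` the count is 2, which corresponds to the two-finger shape, while a flat waveform gives a count of 0 and is judged as a fist.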

An icon generating unit 34 in FIG. 14 generates an icon image to be shown on the display on the basis of the result of identifying the shape of the hand by the shape judging unit 3262 in FIG. 15. For example, if the result of identifying the hand shape is a shape of one finger, an icon indicated by an arrow as shown in FIG. 16(A) is created, or in the case of a fist shape, an icon indicated by an x mark as shown in FIG. 16(B) is created. A display controller 33 controls the display position of the icon generated by the icon generating unit 34 on the display, and is composed of a coordinate transforming unit 331 and a coordinate inverting unit 332. The coordinate transforming unit 331 transforms the coordinates of the taken image into display coordinates on the display, and the coordinate inverting unit 332 inverts the lateral positions of the transformed display coordinates. That is, the coordinates of the center of gravity in the image of the portion corresponding to the user's hand issued by the coordinate operation unit 3261 in FIG. 15 are transformed into display coordinates on the display, and the lateral coordinates are inverted to display an icon in the display. By this manipulation, when the user moves the hand to the right, the icon moves to the right on the display screen, like a mirror action.
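The two steps of the display controller 33 can be illustrated with a short sketch. It assumes a simple linear scaling from image size to display size and continuous (non-integer) coordinates; the function name `to_display` and the sizes used are hypothetical.

```python
def to_display(hand_xy, image_size, display_size):
    """Map hand coordinates in the taken image to mirrored display coordinates."""
    ix, iy = hand_xy
    image_w, image_h = image_size
    display_w, display_h = display_size

    # Coordinate transformation (unit 331): assumed to be a linear scaling.
    x = ix * display_w / image_w
    y = iy * display_h / image_h

    # Coordinate inversion (unit 332): flip the lateral position only,
    # so the icon follows the hand like a mirror image.
    x = display_w - x
    return x, y
```

For example, a hand at (160, 120) in a 640 x 480 image maps to (960.0, 240.0) on a 1280 x 960 display.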

In the embodiment thus constituted, an example of operation is described below. When the user confronts the appliance incorporating the interface apparatus of the embodiment and points out one finger of the hand, the arrow cursor of the icon 203 appearing on the display 2 moves to an arbitrary position corresponding to the move of the hand. Then, by moving the hand to an arbitrary one of the virtual switches 204, 205, 206 shown on the display 2, the arrow cursor is moved, and when the hand is gripped to form a fist, that virtual switch is selected, and an instruction is given to the host computer 1.

As an example of the icon to be displayed, as shown in FIGS. 16(C), (D), when the shape of the hand itself is formed into an icon, it corresponds to the move of the actual hand, and it is intuitive. More specifically, the images as shown in FIGS. 16(C) and (D) may be entered beforehand, or the contour shape data of the hand depicted by the contour depicting unit may be contracted or magnified to a desired size and used as an icon image.

Thus, in this embodiment, when the user confronts the image pickup unit of the interface apparatus and instructs, for example, by a hand gesture, the image pickup unit picks up the image of the user, depicts the contour of the user's image, and transforms it into a contour waveform, that is, the angle of the contour line to the horizontal line, plotted against the length of the contour line from the reference point on the contour line on the horizontal axis. This contour waveform is transformed into a shape waveform expressing the uneven shape of the fingers by the shape filter composed of a band pass filter of specified band, for example, a band pass filter corresponding to the undulations of the fingers, and the position of the hand is calculated. Simultaneously the number of pulses existing in the shape waveform is counted, so that the number of projected fingers, that is, the shape of the hand, can be accurately judged. On the basis of the position and shape of the hand, the given hand gesture is shown on the display screen, and, for example, a virtual switch shown on the display screen can be selected, and very easy manipulation of the appliance is realized without requiring a mouse or other input device.

(Fourth embodiment)

The foregoing embodiments relate to examples of manipulation on two-dimensional images shown on the display screen, whereas this embodiment relates to manipulation of a virtual three-dimensional image shown on a two-dimensional display screen.

Generally, to grasp a virtual object in a displayed virtual three-dimensional space by using a cursor, the following constitution is considered.

In FIG. 21, reference numeral A1 is an input device, A2 is a cursor coordinate memory unit, A3 is an object coordinate memory unit, A4 is a display device, and A5 is a contact judging unit. FIG. 22(a) and FIG. 22(b) show cursors in two-finger manipulator shape that can be expressed from the shape of the hand of the operator, the same as in the foregoing embodiments. FIG. 22(a) shows an open finger state, and FIG. 22(b) a closed finger state. FIG. 23 shows an example of a virtual object in a virtual space. Suppose the operator acts to grab the virtual object in a virtual three-dimensional space by using a two-finger cursor. FIG. 24(a) and FIG. 24(b) show the configuration of the cursor and virtual object in the virtual space when gripping the virtual object by using the cursor. FIG. 25 shows the display of the display device A4.

When manipulation of the operator is given to the input device A1, the cursor coordinates and the two-finger interval of the cursor stored in the cursor coordinate memory unit A2 are updated according to the manipulation. The display device A4 depicts the virtual space including the cursor and virtual object by using the information stored in the cursor coordinate memory unit A2 and the position information of the virtual object stored in the object coordinate memory unit A3. Herein, the contact judging unit A5 calculates whether or not the cursor and virtual object contact each other in the virtual space, by using the information stored in the cursor coordinate memory unit A2 and the position information of the virtual object stored in the object coordinate memory unit A3. More specifically, the distance between the plural surfaces composing the cursor and virtual object in the virtual space is calculated for each surface, and when the virtual object is in contact between the two fingers of the cursor, it is judged that the cursor has grabbed the virtual object, and thereafter the coordinates of the object are changed according to the move of the cursor.

In such technique, however, the display by the display device is as shown in FIG. 25 in the case of the configuration shown in FIG. 24(a) or (b), and the operator may misjudge that the coordinates are matched although the positions of the cursor and virtual object are not matched exactly in the virtual space. Or, in the case of using a three-dimensional display device or in the case of combined display of FIGS. 24(a) and (b), smooth manipulation is difficult due to the difference in the sense of distance in the actual space and in the virtual space.

Thus, due to the difference between the sense of distance in a virtual space which is a display space and the sense of distance in an actual space, or due to the difference between the motion of the cursor intended by the operator and the actual motion of the cursor, interaction according to the intent of the operator (in this case, grabbing of the virtual object) cannot be realized smoothly in the interaction of the cursor and virtual object in the virtual space (for example, when grabbing the virtual object by a virtual manipulator).

In this embodiment, the operator can control the cursor in the virtual space with ease by hand gesture or the like without making contact. The presence or absence of interaction with the virtual object in the virtual space is determined not only by the distance between the cursor and the constituent element of the virtual object (the surface in the case of a three-dimensional virtual space); the interaction judging means can also induce interaction on an object whose distance in the virtual space is not necessarily close. The judgment of whether the cursor induces interaction with the virtual object is thereby made closer to the intent of the operator, so that the controllability of the interface may be further enhanced.

A first constitution of the embodiment is an interface apparatus comprising display means, input means for changing the position and shape of the cursor displayed in the display means, cursor memory means for storing coordinates of a representative point representing the position of the cursor and the shape of the cursor, object memory means for storing coordinates of a representative point representing the position of a display object other than the cursor and the shape of the display object, and interaction judging means for judging interaction between the cursor and the display object, by using the position and shape of the cursor stored in the cursor memory means and the position and shape of the display object stored in the object memory means, wherein the interaction judging means is composed of distance calculating means for calculating the distance between at least one representative point of the cursor and at least one representative point of the display object, motion recognizing means for recognizing the move of the cursor or change of its shape, and overall judging means for determining the interaction of the cursor and display object by using the distance calculated by the distance calculating means and the result of recognition by the motion recognizing means.

According to this constitution, presence or absence of occurrence of interaction between the cursor manipulated by the operator in the virtual space and the virtual object in the virtual space is determined not only by the distance between the cursor in the virtual space and the constituent element of the virtual object (the surface in the case of a three-dimensional virtual space), but the overall judging means judges the presence or absence of occurrence of interaction by the distance between representative points calculated by the distance calculating means and the motion of the cursor recognized by the motion recognizing means, so that interaction may be induced also on an object whose distance is not necessarily close in the virtual space.

In the first constitution, a second constitution may be designed so that, when the motion recognizing means recognizes a preliminarily registered motion, the interaction judging means induces an interaction on the display object whose distance calculated by the distance calculating means is below a predetermined reference.

In the first and second constitutions, by installing move vector calculating means for calculating the moving direction and moving distance of the cursor in the display space to compose the interaction judging means, a third constitution may be composed so that the interaction judging means determines the interaction of the cursor and display object on the basis of the moving direction and moving distance of the cursor calculated by the move vector calculating means.

The third constitution may be modified into a fourth constitution in which the interaction judging means generates an interaction when the cursor moving distance calculated by the move vector calculating means is less than the predetermined reference value.

In the third and fourth constitutions, the interaction judging means may generate an interaction on the display object existing near the extension line in the moving direction of the cursor calculated by the move vector calculating means, so that a fifth constitution may be composed.

In the first to fifth constitutions, the interaction judging means may generate an interaction when the shape of the cursor and shape of the display object become a preliminarily registered combination, which may be composed as a sixth constitution.

In the first to sixth constitutions, by composing the interaction judging means by incorporating shape judging means for recognizing the shape of the cursor and shape of the display object, a seventh constitution may be constructed so that the interaction judging means may generate an interaction when the shape of the cursor and shape of the display object recognized by the shape judging means coincide with each other.

In the first to seventh constitutions, by comprising sight line input means for detecting the sight line direction, an eighth constitution may be composed in which the interaction judging means generates an interaction, when the motion recognizing means recognizes a preliminarily registered motion, on the display object near the extension line of the sight line detected by the sight line input means.

The eighth constitution may be modified into a ninth constitution in which the interaction judging means generates an interaction on the display object near the extension line of the sight line detected by the sight line input means when the cursor is present near the extension line of the sight line and the motion recognizing means recognizes a preliminarily registered motion.

In the first to ninth constitutions, learning means may be provided for learning, when an interaction is generated, the configuration of the cursor and the objective display object, and the shape of the cursor and shape of the display object, so that a tenth constitution may be composed in which the interaction judging means determines the interaction on the basis of the learning result of the learning means.

The tenth constitution may be modified into an eleventh constitution in which the interaction judging means generates an interaction when the configuration of the cursor and objective display object, or the shape of the cursor and shape of the display object, is similar to the configuration or shapes learned in the past by the learning means.

Instead of the first to eleventh constitutions, a twelfth constitution may be composed in which the interaction judging means is composed by incorporating coordinate transforming means for transforming the coordinates from the cursor memory means and object memory means that are input to the distance calculating means.

The twelfth constitution may be modified into a thirteenth constitution in which the cursor and objective display object may be brought closer to each other when an interaction is generated.

The fourth embodiment is described in detail by referring to the drawing. FIG. 26 is a block diagram of the interface apparatus of the embodiment.

In FIG. 26, reference numeral 41 is input means, 42 is cursor memory means, 43 is object memory means, 44 is display means, 45 is interaction judging means, 45a is distance calculating means, 45b is motion recognizing means, 45c is overall judging means, 45d is move vector calculating means, 45e is shape judging means, 45f is learning means, 45g is coordinate transforming means, and 46 is sight line input means.

In FIG. 26, the operator manipulates the input means 41, the cursor memory means 42 changes and stores the coordinates and shape of the representative point representing the position of the cursor in the virtual space, and the display means 44 shows the cursor and virtual object in two-dimensional display or three-dimensional display on the basis of the coordinates and shape of the representative point representing the position of the cursor in the virtual space stored in the cursor memory means 42 and the coordinates and shape of the representative point representing the position of the virtual object in the virtual space stored in the object memory means 43.

The sight line input means 46 detects the position of the sight line of the operator on the display. The distance calculating means 45a calculates the distance between the cursor and virtual object in the virtual space on the basis of the coordinates of the representative points stored in the cursor memory means 42 and object memory means 43. The motion recognizing means 45b recognizes the motion of manipulation on the basis of the data stored in the cursor memory means 42 and object memory means 43. The move vector calculating means 45d calculates the moving direction and moving distance of the cursor in the virtual space. The shape judging means 45e judges whether or not the shape of the cursor and shape of the virtual object are appropriate for inducing an interaction. The learning means 45f stores the relation of position and shape of the cursor and virtual object when the overall judging means 45c has induced an interaction between the cursor and virtual object, and reports whether or not the present state is similar to a past state of inducing interaction.

The overall judging means 45c judges whether or not the cursor and virtual object interact with each other, on the basis of the distance between the cursor and virtual object issued by the distance calculating means 45a, the result recognized by the motion recognizing means 45b, the moving direction and moving distance of the cursor calculated by the move vector calculating means 45d, the position of the sight line detected by the sight line input means 46, the judging result of the shape judging means 45e, and the degree of similarity to past interactions issued by the learning means 45f, and changes the coordinates and shape of the representative points of the cursor and virtual object depending on the result of interaction. The coordinate transforming means 45g transforms the coordinates of the cursor and objective virtual object in the virtual space used in the distance calculation by the distance calculating means 45a so that the positions of the two may be closer to each other when the interaction judging means 45 induces an interaction.

FIGS. 22(a) and (b) show a two-finger manipulator shape in a first example of the cursor used in the interface apparatus of the invention. In the diagram, the fingers are opened in (a) and the fingers are closed in (b). FIGS. 22(c) and (d) show a two-finger two-joint manipulator shape in a second example of the cursor used in the interface apparatus of the invention. In the diagram, the fingers are opened in (c) and the fingers are closed in (d). FIGS. 22(e) and (f) show a five-finger hand shape in a third example of the cursor used in the interface apparatus of the invention. In FIG. 22, the hand is opened in (e) and the hand is closed in (f).

FIGS. 23(a) and (b) show examples of the object in the virtual space used in the interface apparatus of the invention, showing a cube in (a) and a plane in (b).

The operation of the interface apparatus thus constituted is described below. In this embodiment, it is supposed that the operator moves the cursor as shown in FIG. 22 in a three-dimensional virtual space, and grabs and moves the virtual object as shown in FIG. 23 existing in the virtual space.

Manipulation of the operator is effected on the input means 41. As the input device for feeding information for varying the position or shape of the cursor, the means shown in FIGS. 27(a) to (c), or a camera, keyboard, or command input by voice recognition can be used. FIG. 27(a) relates to a mouse, and the cursor is manipulated by moving the mouse main body or clicking its button. FIG. 27(b) relates to a data glove, which is worn on the hand of the operator, and the cursor is manipulated by reflecting the finger joint angle or position of the data glove in the actual space in the position and shape of the cursor. FIG. 27(c) relates to a joy stick, and the cursor is manipulated by combination of lever handling and operation button. When using a camera, the body or part of the body (for example, the hand) is taken by the camera, and the shape and position of the hand are read.

FIG. 28 shows an example of shape depiction when only the hand is taken by the camera. In FIG. 28(a), the hand is taken by the camera. The luminance of pixels of the image in FIG. 28(a) is converted into binary data in FIG. 28(b). In FIG. 28(b), it is possible to judge the degree of opening or closing of the hand by the ratio of the longer side and shorter side of a rectangle circumscribing the black region, and input of position and distance is enabled from the coordinates of the center of gravity and the area of the entire black pixels. The input means 41 sends the manipulation data (cursor moving distance, cursor shape changing amount, etc.) to the cursor memory means 42.
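The quantities read from the binary image can be sketched as below. This is an illustrative sketch under assumptions not stated in the text: the image is a list of rows of luminance values, pixels at or above a hypothetical threshold are treated as the hand region (the black region of FIG. 28(b)), and the name `hand_features` is invented for the example.

```python
def hand_features(gray, threshold=128):
    """Return side ratio, center of gravity and area of the hand region.

    Assumption: pixels with luminance >= threshold belong to the hand.
    Returns None when no hand pixel is found.
    """
    points = [(x, y)
              for y, row in enumerate(gray)
              for x, value in enumerate(row)
              if value >= threshold]
    if not points:
        return None

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # Rectangle circumscribing the hand region.
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    ratio = max(width, height) / min(width, height)  # longer side / shorter side

    area = len(points)                                # relates to distance
    centroid = (sum(xs) / area, sum(ys) / area)       # relates to position
    return {"ratio": ratio, "centroid": centroid, "area": area}
```

How the ratio maps to a degree of opening, and how the area maps to distance, would depend on calibration that the text does not specify.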

The cursor memory means 42 updates and stores the coordinates and shape of the representative point of the cursor in the virtual space on the basis of the manipulation data sent out by the input means 41. As the representative point, the coordinates of the center of gravity of the cursor (X0, Y0, Z0) may be used. Moreover, as the representative point, the coordinates of the center of each surface composing the cursor or the coordinates of an apex may also be used. As the shape, the two-finger interval d in the case of FIG. 22(a), or the internal angle θn of the joint of each finger in the case of FIGS. 22(b), (c) (n is a joint number: as θn becomes smaller, the joint is bent) may be used as the storage information. Moreover, as the shape, the coordinates of the finger tip of each finger or of each joint in the virtual space may also be used.

The object memory means 43 stores the coordinates and shape of the representative point of the virtual object in the virtual space shown in FIG. 23 as the object of manipulation. As the representative point, the coordinates of the center of gravity of the virtual object (cube: (X1, Y1, Z1), plane: (X2, Y2, Z2)) are used. Also as the representative point, the coordinates of the center of each surface composing the virtual object or the coordinates of an apex may be used. As the shape, a parameter α showing a predetermined shape is stored (herein the cube is defined as α=1, and the plane as α=2). Also as the shape, the coordinates of the apexes may be used.

The display means 44 shows the image of the virtual space in two-dimensional display as seen from a preliminarily assumed viewpoint, on the basis of the information of the position and shape of the cursor and virtual object stored in the cursor memory means 42 and object memory means 43. FIG. 29(a) shows a display example of the display means. When the operator manipulates, the display position or shape of the cursor is changed, and the operator continues to manipulate according to the display.

The interaction judging means 45 judges whether or not the cursor has grabbed the object (presence or absence of interaction) every time the cursor position changes, and when it is judged that the cursor has grabbed the object, the coordinates of the virtual object are moved according to the move of the cursor.

The distance calculating means 45a calculates the distance between the coordinates of the center of gravity of the cursor (X0, Y0, Z0) stored in the cursor memory means 42 and the coordinates of the center of gravity of the virtual object (X1, Y1, Z1), (X2, Y2, Z2) stored in the object memory means 43.

The motion recognizing means 45b recognizes the motion of “grab” as the preliminarily registered motion by using the change of shape of the cursor. In the case of the cursor in FIG. 22(a), the decreasing state of the interval d of two fingers is recognized as the “grab” action, and in the case of the cursor in FIGS. 22(b), (c), the decreasing state of the angle θn of all fingers is understood as the “grab” action. As the technique of recognizing the motion, meanwhile, specific motions may be learned preliminarily from the time series changes of the parameters representing the shape (such as d and θn mentioned above) by using time series pattern recognition techniques (table matching, DP matching, hidden Markov model (HMM), recurrent neural network, etc.), and then used for recognition. The move vector calculating means 45d calculates the moving direction and moving distance of the cursor in the virtual space by using the changes of the coordinates of the center of gravity of the cursor (X0, Y0, Z0). For example, the direction and magnitude of the differential vector between the coordinates of the center of gravity at the present time t, (X0, Y0, Z0)t, and the coordinates of the center of gravity at a certain previous time, (X0, Y0, Z0)t−1, are used as the moving direction and moving distance of the cursor.
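The differential vector and the "decreasing interval" test can be sketched as follows. The sketch covers only the simple decreasing-state rule for the two-finger interval d, not the learned time series recognizers; the names `move_vector` and `is_grab` and the number of samples required (`min_steps`) are assumptions made for illustration.

```python
import math


def move_vector(previous, current):
    """Return (unit direction, distance) of the cursor's center of gravity
    between a previous time and the present time."""
    diff = tuple(c - p for p, c in zip(previous, current))
    distance = math.sqrt(sum(d * d for d in diff))
    if distance == 0:
        return (0.0, 0.0, 0.0), 0.0  # cursor did not move
    return tuple(d / distance for d in diff), distance


def is_grab(d_history, min_steps=3):
    """True when the finger interval d decreased at every one of the
    last min_steps samples (assumed criterion for a 'decreasing state')."""
    if len(d_history) < min_steps + 1:
        return False
    recent = d_history[-(min_steps + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))
```

For instance, a move from (0, 0, 0) to (3, 4, 0) gives a moving distance of 5.0, and an interval history of 10, 9, 7, 4 is treated as a grab while 10, 9, 9, 4 is not.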

The shape judging means 45e judges whether or not the shape of the cursor stored in the cursor memory means is proper for grabbing the virtual object of the shape stored in the object memory means (whether or not the cursor shape is appropriate for inducing interaction with the virtual object). Herein, when the value of the parameter α representing the shape of the object is 1, the cursor finger open state is regarded as an appropriate state. The cursor is judged to be in the finger open state, for example, when the value of d is larger than the intermediate value between 0 and the maximum value of d, dmax, in the case of the cursor shown in FIG. 22(a), and when all joint angles θn are greater than the intermediate value between 0 and the maximum value θn max in the case of FIGS. 22(b), (c).

When the value of the parameter α expressing the object shape is 2, it is an appropriate state when the interval of the finger tips of the cursor is narrow. The finger tips of the cursor are judged to be narrow, for example, when the value of d is smaller than the intermediate value between 0 and the maximum value of d, dmax, in the case of the cursor in FIG. 22(a), or when all joint angles θn are smaller than the intermediate value between 0 and the maximum value θn max in the case of FIGS. 22(b), (c). As another judging method of the shape, incidentally, the parameter expressing the cursor shape (d or θn) is stored preliminarily in the state of the cursor grabbing the virtual object in contact state in the virtual space, and when the values of the parameters coincide within a range of ±30%, the shape may be judged to be appropriate for the grabbing action.
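The threshold rules for the two-finger cursor of FIG. 22(a) can be illustrated with a short sketch. It uses the shape parameter values defined for the object memory means (cube α=1, plane α=2), takes the "intermediate value" to be the midpoint dmax/2, and reads the tolerance as ±30% of the stored value; the function names are hypothetical.

```python
def shape_is_appropriate(alpha, d, d_max):
    """Judge whether finger interval d suits an object of shape alpha."""
    midpoint = d_max / 2.0  # assumed meaning of "intermediate value of dmax and 0"
    if alpha == 1:          # cube: fingers should be open
        return d > midpoint
    if alpha == 2:          # plane: finger tips should be narrow
        return d < midpoint
    return False            # unknown shape parameter


def matches_stored(d, d_stored, tolerance=0.30):
    """Alternative rule: d is within the tolerance of the value stored
    when the cursor grabbed the object in contact state."""
    return abs(d - d_stored) <= tolerance * abs(d_stored)
```

With dmax = 10, an interval of 8 is appropriate for the cube and an interval of 2 is appropriate for the plane.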

The sight line input means 46 detects the sight line of the operator, and calculates the coordinates noticed by the operator on the display screen of the display means 44 (coordinates of the notice point). As the sight line detecting means, the direction of the pupil of the operator is detected by using a photo sensor such as a CCD camera, and the notice point on the screen is calculated by also measuring the position of the head of the operator by using a camera or the like.

The learning means 45f stores the parameter (d or θn) showing the shape of the cursor when the overall judging means 45c judges that the cursor has grabbed the virtual object, the parameter α showing the shape of the grabbed object, and the relative configuration of the position of the cursor and the position of the virtual object (the vector linking the center of gravity of the cursor and the center of gravity of the virtual object). When the parameter expressing the present shape of the cursor, the parameter expressing the shape of the surrounding virtual object, and the configuration of the present center of gravity of the cursor and the center of gravity of the surrounding virtual object are close to the past state of grabbing the object (for example, when the parameters and the element values of each dimension of the vector expressing the configuration coincide with the past values within a range of ±30%), it is judged to be close to the past situation and 1 is issued, and otherwise 0 is issued. As other means of learning, meanwhile, the parameter expressing the shape of the cursor when grabbing the object in the past, the parameter α expressing the shape of the grabbed virtual object, and the relative configuration of the position of the cursor and position of the virtual object may be learned by using neural networks or the like. As the learning items, it may also be possible to learn together the configuration of the notice point coordinates on the screen detected by the sight line input means 46 and the coordinates of the cursor on the display screen.
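The ±30% comparison against past grabs can be sketched as follows. This is only one possible reading: each stored case is assumed to be a flat tuple (cursor shape parameter, object shape parameter α, and the three components of the cursor-to-object vector), and the tolerance is taken relative to the past value, so a past value of exactly 0 matches only 0. The names `similar` and `learning_output` are hypothetical.

```python
def similar(past, present, tolerance=0.30):
    """True when every element of present is within the tolerance
    of the corresponding past value."""
    return all(abs(now - before) <= tolerance * abs(before)
               for before, now in zip(past, present))


def learning_output(past_cases, present):
    """Issue 1 when the present state is close to any stored grab, else 0."""
    return 1 if any(similar(case, present) for case in past_cases) else 0
```

For example, with one stored case (10, 1, 5, -5, 30), the present state (12, 1, 4, -6, 33) yields 1, whereas (14, 1, 4, -6, 33) yields 0 because the cursor shape parameter differs by more than 30%.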

The coordinate transforming means 45g transforms the coordinates used in distance calculation by the distance calculating means when grabbing the object (when an interaction is caused) so that the distance between the cursor and the objective virtual object in the virtual space may be shorter. For example, supposing the coordinates to be (100, 150, 20) and (105, 145, 50) when the cursor grabs the virtual object, the coordinate transforming means transforms the Z-coordinate, which has the largest difference among the coordinates, as shown in formula (1).

Z′=0.8×Z  (1)

where Z is the Z-coordinate of the center of gravity of the cursor and of the virtual object received by the coordinate transforming means, and Z′ denotes the Z-coordinate as the output of the coordinate transforming means.

In this case, the values of the X-coordinate and Y-coordinate are not changed. The values stored in the cursor memory means and object memory means are not changed either, and hence the screen shown by the display means is not changed. By such transformation, even if the two are remote in the virtual space, once the operator attempts to grab, the distance between the cursor and virtual object thereafter becomes shorter in the distance calculation, so that the distance calculating means can calculate a distance closer to the sense of distance felt by the operator.
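As a worked illustration of formula (1) with the coordinates given in the example, the following sketch compares the distance before and after the transformation. The function name `compress_z` is hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def compress_z(point, factor=0.8):
    """Apply Z' = 0.8 x Z; the X- and Y-coordinates are left unchanged."""
    x, y, z = point
    return (x, y, factor * z)


cursor = (100, 150, 20)
virtual_object = (105, 145, 50)

# Distance between the stored centers of gravity (about 30.8).
before = math.dist(cursor, virtual_object)

# Distance seen by the distance calculating means after the transform (about 25.0).
after = math.dist(compress_z(cursor), compress_z(virtual_object))
```

The Z difference shrinks from 30 to 24, so the calculated distance falls from roughly 30.8 to roughly 25.0 while the stored coordinates, and therefore the displayed image, stay the same.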

The overall judging means 45c judges that the interaction of “grab” has occurred when the motion recognizing means 45b recognizes the preliminarily registered action of “grab” while the distance between the cursor and the virtual object issued by the distance calculating means 45a is less than a predetermined reference, and thereafter, until the interaction of “grab” is terminated, the values of the coordinates of the center of gravity of the grabbed virtual object stored in the object memory means 43 are matched with the coordinates of the center of gravity of the cursor. Herein, the predetermined reference value may be larger than the actual distance at which the cursor and object contact in the virtual space. For example, in the case of FIG. 25 (configuration of FIG. 24), if the distance between the virtual object and cursor is less than the reference value of the distance, the operator may instruct the grabbing action to the input means 41, and when the motion recognizing means 45b recognizes the grabbing action, the virtual object may be grabbed and moved.

Meanwhile, if there are plural virtual objects below the reference of the distance, the overall judging means 45c considers only the objects for which the angle formed by the line segment (wavy line) linking the cursor and virtual object and the moving direction (arrow) of the cursor calculated by the move vector calculating means 45d is below a reference (for example, 90 degrees), as shown in FIG. 29(b), so that the interaction can be judged in consideration of the moving direction of the cursor in the process of manipulation (the uppermost of the three objects in the diagram is selected). As for the cursor moving distance, if the moving distance is longer than the predetermined moving distance reference, interaction does not occur. Thus, when merely moving the cursor, interaction not intended by the operator is not caused.
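The distance and angle conditions can be combined in a small sketch. It is an illustration under assumptions: objects are given as a dictionary from a hypothetical name to center-of-gravity coordinates, the reference values are arbitrary, and the names `angle_deg` and `candidates` are invented here. `math.dist` requires Python 3.8 or later.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between vectors u and v, or None for a zero vector."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return None
    cosine = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def candidates(cursor, move_direction, objects, distance_ref, angle_ref=90.0):
    """Names of objects closer than distance_ref whose direction from the
    cursor forms an angle below angle_ref with the cursor's moving direction."""
    selected = []
    for name, position in objects.items():
        if math.dist(cursor, position) >= distance_ref:
            continue  # too far to interact
        to_object = tuple(p - c for p, c in zip(position, cursor))
        angle = angle_deg(to_object, move_direction)
        if angle is not None and angle < angle_ref:
            selected.append(name)
    return selected
```

With the cursor at the origin moving in the +Y direction, an object at (0, 5, 0) is kept, an object at (0, -5, 0) is rejected by the angle test, and an object at (0, 50, 0) is rejected by a distance reference of 10.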

Moreover, as shown in FIG. 29(c), when plural virtual objects satisfy the reference, the virtual object close to the position of the notice point detected by the sight line input means 46 is taken as the object of grabbing by the overall judging means 45c (in the diagram, the object at the left side close to the “+” mark indicating the notice point is selected). As a result, the object can be easily selected by using the sight line of the operator.
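Selection by the notice point can be sketched as picking the candidate nearest to it on the screen. The sketch assumes that the screen positions of the candidate objects are already known in the same two-dimensional coordinates as the notice point; the function name `select_by_notice_point` and the sample data are hypothetical.

```python
import math


def select_by_notice_point(notice_point, screen_positions):
    """Return the name of the candidate whose screen position is
    closest to the notice point, or None when there is no candidate."""
    if not screen_positions:
        return None
    return min(screen_positions,
               key=lambda name: math.dist(notice_point, screen_positions[name]))
```

For example, with a notice point at (100, 200) and candidates at (120, 210) and (400, 220), the first candidate is selected.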

Incidentally, as shown in FIG. 29(d), if there are close objects on the screen, the virtual object coinciding with the shape of the cursor judged by the shape judging means 45e is taken as the object of grabbing by the overall judging means 45c (in the diagram, since the finger interval of the cursor is narrow, a plane is judged to be appropriate as the object of grabbing, and hence the plane is selected). As a result, the virtual object intended by the operator can be selected by the shape of the cursor, and the operator can manipulate easily by forming the cursor shape that is naturally associated with grabbing the virtual object.

The overall judging means 45c selects by priority the virtual object judged by the learning means 45f to be similar to the situation when an object was grabbed in the past. As a result, a judgment closer to the past manipulation by the operator can be reproduced, and the controllability may be enhanced.

Thus, according to the invention, presence or absence of interaction between the cursor manipulated by the operator in the virtual space and the virtual object in the virtual space is determined not only by the distance between the cursor and the virtual object but also on the basis of the action, sight line or past cases in the manipulation of the operator, so that the controllability may be enhanced in an interface for interaction with the virtual object by using the cursor in the virtual space.

The embodiments have been explained by referring to the action of grabbing the virtual object by using the cursor as the interaction, but similar handling is also possible in other motions, such as indicating (pointing to) the virtual object, collision, friction, impact, and remote control. Similar effects are obtained if the virtual space is a two-dimensional space or if the display means is a three-dimensional display. The invention may be realized by using hardware, or by using software on a computer.

In this way, according to the embodiments, presence or absence of occurrence of interaction between the cursor manipulated by the operator in the virtual space and the virtual object in the virtual space is not determined solely by the distance between the constituent elements of the cursor and the virtual object in the virtual space. Instead, the overall judging means judges presence or absence of occurrence of interaction on the basis of the distance between representative points calculated by the distance calculating means and the motion of the cursor recognized by the motion recognizing means. An interaction may therefore be induced also on an object whose distance is not necessarily close in the virtual space, so that an input and output interface excellent in controllability may be presented. Moreover, it is not necessary to calculate the distances between all the constituent elements of the cursor and the virtual object in the virtual space, as required in the conventional contact judging method, and therefore the quantity of calculation is lessened and the processing speed is enhanced.
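The saving in calculation may be illustrated by the following sketch, which contrasts a single distance between representative points with the pairwise distances among all constituent elements. The use of the centroid as the representative point, and the function names, are assumptions made for illustration.

```python
import math

def centroid(points):
    """Mean position of a non-empty list of equal-dimension points."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))

def representative_distance(cursor_points, object_points):
    # One distance evaluation once each representative point is known.
    return math.dist(centroid(cursor_points), centroid(object_points))

def pairwise_min_distance(cursor_points, object_points):
    # Conventional contact-style check: len(cursor) * len(object) evaluations.
    return min(math.dist(c, o) for c in cursor_points for o in object_points)

cursor = [(0.0, 0.0), (2.0, 0.0)]
obj = [(4.0, 0.0), (6.0, 0.0)]
rep = representative_distance(cursor, obj)   # centroids (1, 0) and (5, 0)
pair = pairwise_min_distance(cursor, obj)    # closest pair (2, 0) and (4, 0)
```

The two measures generally differ in value, so a threshold tuned for one does not carry over to the other.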

Accordingly, the invention can recognize the shape or motion of the hand of the operator and display the feature of the shape of the recognized hand on the screen as a cursor in the form of the special shape, so that the information displayed on the screen can be controlled easily, with superior controllability, by the shape or motion of the hand.

Moreover, the feature of the shape of the hand is displayed on the screen as a cursor in the form of the special shape, and the relation with display objects other than the cursor is judged sequentially and automatically as an interaction in line with the intent of the operator, so that an interface further enhanced in the controllability of the manipulation of indicating or gripping a display object can be realized.

What is claimed is:
 1. An interface apparatus comprising: recognizing means for recognizing a shape of a hand and a given motion of the hand of an operator; display means for displaying a visible menu programmed to correspond to the features of the shape of the hand recognized by the recognizing means on a screen; and control means for selecting the visible menu displayed on the screen based on the features of the shape of the hand recognized by said recognizing means and for issuing an instruction based on the recognition of the given motion of the hand recognized by said recognizing means after selecting the menu.
 2. An interface apparatus as defined in claim 1, wherein the given motion is the operator's maintaining the shape of the hand for a specified time.
 3. An interface apparatus of claim 2, wherein said display means displays said features of the shape of the hand on the screen as a special shape.
 4. An interface apparatus comprising: recognizing means for recognizing a shape of a hand and a given motion of the hand of an operator, wherein the given motion is the operator's maintaining the shape of the hand for a specified time; display means for displaying a visible menu programmed to correspond to the features of the shape of the hand recognized by the recognizing means on a screen wherein said display means displays said features of the shape of the hand on the screen as a special shape and the special shape is an icon representing a numeral; and control means for selecting the visible menu displayed on the screen based on the features of the shape of the hand recognized by said recognizing means and for issuing an instruction based on the recognition of the given motion of the hand recognized by said recognizing means after selecting the menu.
 5. An interface apparatus of claim 3, wherein the recognizing means recognizes a number of fingers of the hand, the display means displays the number of fingers as the special shape and the control means selects the visible menu displayed on the screen based on the number of the fingers of the hand recognized by the recognizing means.
 6. An interface apparatus as defined in claim 5, wherein the special shape is an icon representing a number, and the display means emphasizes a part of the menu corresponding to the number based on the number of fingers of the hand recognized by the recognizing means.